The present invention relates to a computer system for conditionally carrying out an
operation defined in a computer instruction, to an execution unit for use in the
computer system and to a method of executing instructions.



Background to the Invention 



Computer systems are known in which execution of an instruction is predicated on
some value identified or addressed in the instruction. However, in these computer
systems, the operation defined in the instruction is either carried out or not carried out.

Computer systems are also known which act on so-called packed operands. That is,
each operand comprises a plurality of packed objects held in respective lanes of the
operand. The degree of packing can vary and, for 64 bit operands, it is known to
provide byte packing (eight objects per 64 bit operand), halfword packing (four objects
per 64 bit operand) and word packing (two objects per 64 bit operand). In known
computer systems, when instructions defining operations on packed operands are
predicated, the predication either causes the operation to be carried out on all of the
operands or not to be carried out at all.

Summary of the Invention

According to one aspect of the present invention there is provided an execution unit for
use in a computer system for conditionally carrying out an operation defined in a
computer instruction, the execution unit comprising: first and second input stores for
holding respective first and second operands on which an operation defined in the
instruction is to be carried out, wherein each store defines a plurality of lanes each
holding an object; a plurality of operators associated respectively with the lanes for 
carrying out an operation specified in the instruction on objects in corresponding 
lanes of the first and second source operands; a destination buffer for holding the 
results of the operation on a lane-by-lane basis; and selecting means for determining 
for each lane in dependence on stored condition values whether or not the operation 
is to be executed on objects in that lane. 

The invention also provides a computer system including such an execution unit. 

The computer system can include a condition code register for holding said condition 
values in the form of a set of condition codes. Each condition code is preferably a 
multibit condition code. 

The computer system can also include a test register for holding a test code, the test 
register being addressed by the instruction. In that case, the test code can be 
compared with selected ones of the set of condition codes. 

The execution unit can include a condition code generator for generating the set of 
condition codes responsive to execution of an instruction. In the described 
embodiment, the number of condition codes in the set corresponds to the maximum 
number of lanes in the first and second source operands. When operands are 
packed to a lesser degree, condition code values are evaluated for each lane of the 
packed operand and then these values are used to set two or more condition codes 
so that each individual condition code in the set is generated regardless of the degree 
of packing of the source operand. In another embodiment, condition codes are 
generated only for each lane also for packed objects at less than the maximum 
packing density. 

Another aspect of the invention provides a method of executing instructions on 
operands containing a plurality of packed objects, the method comprising: accessing 
at least one source operand containing a plurality of packed objects in respective
lanes; accessing stored condition values to determine for each lane whether or not
an operation defined in the instruction is to be implemented on that lane of the 
operand; and carrying out the operation and updating a destination operand only in 
those lanes for which the stored condition value indicates that the operation should 
be implemented. 

For a better understanding of the present invention and to show how the same may 
be carried into effect, reference will now be made by way of example to the 
accompanying drawings. 

Brief Description of the Drawings 

Figure 1 is a schematic block diagram illustrating a processor; 

Figure 2 is a diagram illustrating the encoding of two "packed" instructions; 

Figure 3 illustrates a number of different 32 bit instruction formats; 

Figure 4 illustrates the contents of a test register;

Figure 5 illustrates the contents of a condition code register; 

Figure 6 is a diagram illustrating setting of condition codes; and 

Figure 7 is a diagram illustrating use of condition codes. 

Description of the Preferred Embodiment 

The computer system described herein is schematically illustrated in Figure 1. In
Figure 1, reference numeral 2 denotes a
program memory which holds programs in the form of a plurality of instructions. The 
program memory 2 is connected to an instruction cache 3 which is connected to 
instruction fetch/decode circuitry 4. The fetch/decode circuitry issues addresses to 
the program memory and receives on each fetch operation a 64 bit instruction from 
the program memory 2 (or cache 3). Each 64 bit instruction can define two 
operations or a single operation. The decode unit 4 evaluates the opcode and
transmits the appropriate control signals along X and Y channels 5x,5y. Each channel
comprises a SIMD execution unit 8x,8y which includes three data processing units,
MAC, INT and FPU and a load/store unit LSU 6. Each data processing unit MAC,
INT and FPU and the load/store units LSU operate on a single instruction multiple
data (SIMD) principle according to the SIMD lane expressed in the instruction
according to the following protocol which defines the degree of packing of objects for
packed data processing operations:



(B) - 8 bit objects (b0 ... b7)
(H) - 16 bit objects (h0 ... h3)
(W) - 32 bit objects (w0 ... w1)
(L) - 64 bit objects (l)
(S) - 32 bit floating point
(D) - 64 bit floating point
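
By way of illustration only, the packing protocol above can be modelled in software as follows. The sketch assumes, which the description does not state, that lane 0 occupies the least significant bits of the 64 bit operand. The names LANE_BITS, unpack and pack are hypothetical, and only the integer designators B, H, W and L are modelled.

```python
# Lane width in bits for each integer packing designator (B, H, W, L).
LANE_BITS = {"B": 8, "H": 16, "W": 32, "L": 64}


def unpack(operand, packing):
    """Split a 64 bit operand into its packed objects, lane 0 first.

    Assumption: lane 0 is held in the least significant bits.
    """
    width = LANE_BITS[packing]
    mask = (1 << width) - 1
    return [(operand >> (i * width)) & mask for i in range(64 // width)]


def pack(objects, packing):
    """Inverse of unpack: place object i in lane i of a 64 bit operand."""
    width = LANE_BITS[packing]
    mask = (1 << width) - 1
    operand = 0
    for i, obj in enumerate(objects):
        operand |= (obj & mask) << (i * width)
    return operand
```

Under these assumptions a byte packed operand yields eight objects, a halfword packed operand four, a word packed operand two and a long operand one.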





For each channel 5x,5y if the instruction defines a data processing operation it is
supplied to the appropriate data processing unit MAC, INT or FPU and if it defines
a load/store operation it is supplied to the load/store unit LSU. Data values are
loaded to and from the MAC data processing units into and out of a common register
file 10 which includes sixteen 64 bit special purpose registers along register access
paths 12x,12y. Data values are loaded to and from the INT and FPU data processing
units and the load/store units LSU into and out of a second register file 11 which
includes sixty-four 64-bit general purpose registers. Register access paths 14x,14y
are provided for these accesses. The program counter PC which indicates the 
current instruction can be read via one of the general purpose registers (the PC 
register). Another one of the general purpose registers constitutes a control and 
status register. The general purpose registers also include a condition code register 
(CCreg) and a test register TSTreg which are discussed in more detail in the 
following. 

Each register access path 12,14 carries three addresses from the accessing unit, two 
source addresses SRC1,SRC2 and a destination address DST. In the case of data 
processing operations, the source addresses SRC1,SRC2 define registers in the 
register files 10,11 which hold source operands for processing by the data
processing unit. The destination address DST identifies a destination register into
which a result of data processing will be placed. The operands and results are
conveyed between the register file 10 or 11 and the respective data processing unit
via the access paths 12,14. In the case of load/store operations, the instruction
formats allow memory access addresses Ax,Ay to be formulated from data values held
in the registers as described in our copending Application No. (PWF Ref: 92472) as
described later. The load/store units access a common address space in the form
of a data memory 16 via a dual ported data cache DCACHE 15. For this purpose,
each load/store unit has a 64 bit data bus Dx,Dy and a 64 bit address bus Ax,Ay.

The architecture supports two types of instruction, so called "long instructions" and 
"dual operation instructions". Each 64 bit dual operation instruction defines two 31 
bit operations in the manner illustrated in Figure 2. That is, each 64 bit sequence 
contains two 32 bit instruction portions labelled INST1 and INST2. Each instruction 
also includes a designated set of identification bits which identify the type of each 
operation. There are two such bits in Figure 2 at bit locations 31 and 63 denoted ID1 
and ID2 in Figure 2. In this embodiment operations are divided into data processing
(DP) types and load/store (LD/ST) types. The allowed combinations are two data
processing operations (ID1,ID2 both set to "0"), two load/store operations (ID1,ID2
both set to "1") or one data processing and one load/store operation (ID1=0,ID2=1). 
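
As an illustrative sketch only, the splitting of a 64 bit dual operation instruction and the checking of its identification bits can be modelled as below. It is assumed here that INST1 is the portion containing ID1 (bits 0 to 31) and INST2 the portion containing ID2 (bits 32 to 63); the function names and the ALLOWED table are hypothetical.

```python
# Allowed (ID1, ID2) combinations as listed in the description.
ALLOWED = {
    (0, 0): ("DP", "DP"),
    (1, 1): ("LD/ST", "LD/ST"),
    (0, 1): ("DP", "LD/ST"),
}


def split_dual_operation(instruction):
    """Return (INST1, INST2, ID1, ID2) for a 64 bit dual operation instruction.

    Assumption: INST1 occupies bits 0 to 31 and INST2 bits 32 to 63.
    """
    inst1 = instruction & 0xFFFFFFFF
    inst2 = (instruction >> 32) & 0xFFFFFFFF
    id1 = (instruction >> 31) & 1   # ID1 at bit location 31
    id2 = (instruction >> 63) & 1   # ID2 at bit location 63
    return inst1, inst2, id1, id2


def operation_types(instruction):
    """Return the pair of operation types, or None if the combination of
    identification bits is not one of those listed as allowed."""
    _, _, id1, id2 = split_dual_operation(instruction)
    return ALLOWED.get((id1, id2))
```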

Figure 3 illustrates a number of basic formats for the 32 bit data processing portions 
of dual operation instructions. Each 32 bit operation requires the resources of one 
load/store unit, or one or more of the SIMD data processing units (MAC, INT, FPU) 
to execute. Thus, normally, two 32 bit operations can execute simultaneously, one 
in each of the X and Y channels. In the following, the X and Y channels are 
sometimes referred to as the right and left sides of the processor. 

Format (1) defines register-register operations. Format (2) defines register-immediate
operations. Format (3) defines a memory access operation. Memory access
operations are not discussed further herein but it is noted that they are used to load
data from memory into the register files and to provide store operations for the
reverse.

The data processing instruction formats have the following fields in common. There
are three identification bits, M (bit 31) and G (bits 29 and 30) which denote the nature
of the data processing operation. There is also an opcode field (bits 22 to 28) which
defines the nature of the operation to be carried out, a condition setting field S (bit 21),
a test register field (bits 18 to 20) and fields defining two source registers Src1,Src2 and
a destination register Dst. Of course, format (2) does not include a field identifying a
second source register, but instead holds an immediate value. As mentioned above, 
the registers in the register files 10,11 can hold a number of packed objects and in 
that case, normally, the operation defined in the opcode of the instruction will be 
carried out on each "lane" of the operand, that is on each pair of corresponding 
packed objects in respective source registers Src1,Src2 or source register and 
immediate value as the case may be. 
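
The common fields recited above can be extracted from a 32 bit operation as in the following sketch. Only the fields whose bit positions are stated in the text are decoded; the positions of the Src1, Src2 and Dst fields are not given here and are therefore omitted. The helper names are hypothetical.

```python
def bits(word, low, high):
    """Return the field of word spanning bit positions low..high inclusive."""
    return (word >> low) & ((1 << (high - low + 1)) - 1)


def decode_common_fields(op):
    """Decode the common fields of a 32 bit data processing operation."""
    return {
        "M": bits(op, 31, 31),        # identification bit M
        "G": bits(op, 29, 30),        # identification bits G
        "opcode": bits(op, 22, 28),   # nature of the operation
        "S": bits(op, 21, 21),        # condition setting field
        "TST": bits(op, 18, 20),      # test register field (0 = unconditional)
    }
```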

As mentioned above, the general purpose registers 11 include CC registers and test
registers. The test register holds a plurality of test bytes with each byte having a 
format as in Figure 4, and the least significant byte set to zero. Each of these bytes 
will be referred to as a Treg byte [1 ... 7]. The Treg bytes are programmable and 
are set to the required values prior to instruction execution. The test register is used 
to allow conditional execution of instructions. Each instruction format contains a 3 bit 
TST field which allows a Treg byte [1 to 7] to be specified. If a Treg byte is not 
specified in the instruction, the TST field in the opcode is set to 0, and the instruction 
executes unconditionally. A specified byte can be modified using an ADL instruction 
of format (2) which adds a byte value specified in the immediate field to a specified 
SIMD lane defined in the TST field (Treg byte [1 ... 7]). 

The 8-bit field of each Treg is as shown in Figure 4. 

The Condition field (bits 0 to 3) applies to all predicated instructions. It holds a 4 bit 
test code to allow for conditions to be tested. As discussed in more detail later, for
instructions on packed objects, the condition applies to all the lanes on a per lane
basis.

The four condition flags are: 

N (Negative flag - bit 3) 
Z (Zero flag - bit 2)
C (Carry flag - bit 1) 
V (Overflow flag - bit 0) 

These four bits give rise to 16 test conditions (see Table 1). 

Table 1

Opcode  Mnemonic  Meaning                    Flags
0000    AL        Always execute
0001    EQ        Equal                      Z
0010    NE        Not Equal                  !Z
0011    CS        Carry Set                  C
0100    CC        Carry Clear                !C
0101    MI        Negative                   N
0110    PL        Positive or Zero           !N
0111    VS        Overflow                   V
1000    VC        No Overflow                !V
1001    HI        Unsigned Higher            C.!Z
1010    LS        Unsigned Lower or Equal    !C+Z
1011    GE        Greater or Equal           N.V+!N.!V
1100    LT        Signed Less Than           N.!V+!N.V
1101    GT        Signed Greater Than        !Z.(N.V+!N.!V)
1110    LE        Signed Less than or Equal  Z+(N.!V+!N.V)
1111    Reserved  Reserved                   Reserved
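
For illustration, the test conditions of Table 1 can be evaluated against a four bit condition code (flags N, Z, C, V at bits 3 to 0) as sketched below. The function name is hypothetical. The LE condition is taken here to be the logical complement of GT, that is Z set or N different from V, which is an assumption consistent with its stated meaning.

```python
def condition_holds(test_code, cc):
    """Return True if condition code cc (NZCV nibble) satisfies test_code."""
    n = bool(cc & 0b1000)
    z = bool(cc & 0b0100)
    c = bool(cc & 0b0010)
    v = bool(cc & 0b0001)
    tests = {
        0b0000: True,                   # AL  always execute
        0b0001: z,                      # EQ
        0b0010: not z,                  # NE
        0b0011: c,                      # CS
        0b0100: not c,                  # CC
        0b0101: n,                      # MI
        0b0110: not n,                  # PL
        0b0111: v,                      # VS
        0b1000: not v,                  # VC
        0b1001: c and not z,            # HI
        0b1010: (not c) or z,           # LS
        0b1011: n == v,                 # GE
        0b1100: n != v,                 # LT
        0b1101: (not z) and (n == v),   # GT
        0b1110: z or (n != v),          # LE (assumed complement of GT)
    }
    if test_code not in tests:
        raise ValueError("test code 1111 is reserved")
    return tests[test_code]
```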



The CC-Selector field (bits 4 to 6) applies only to instructions on non-packed objects. 
The CC-Selector field designates the condition code (in the CC register discussed 
later) which is to be compared with the code in the Treg Byte. 



In the described embodiment, the Side field (S - bit 7, X=0, Y=1) applies only to
non-SIMD 64-bit instructions. For 32-bit operations the side is decided according to the
channel in which the instruction is being executed, irrespective of what the side field 
is set to. For non-SIMD 64-bit instructions, the side field designates either left or right 
(X or Y) condition codes in the CC register. 

The Condition code register is 64 bits long and the bit fields are defined in Figure 5. 
Each condition code is four bits long (a nibble). 



Each CC{X,Y}{0..7} nibble has the four condition flags (NZCV) in that order. The 
condition codes can be set and used on a per lane basis. 



SIMD byte (B) operations update 8 condition codes, X or Y depending on the side of 
the machine. 



CC{X,Y}0...CC{X,Y}7 



SIMD halfword (H) operations update four pairs of condition codes as follows (again
depending on the side of the machine). That is, two condition codes are updated
with the same value for each operation.



CCX0=CCX1 
CCX2=CCX3
CCX4=CCX5 
CCX6=CCX7 



CCY0=CCY1 
CCY2=CCY3 
CCY4=CCY5 
CCY6=CCY7

SIMD word (W) operations update two quads of condition codes, as follows
(depending on the side of the machine). That is, four condition codes are updated to
the same value for each operation.

CCX0=CCX1=CCX2=CCX3 CCY0=CCY1=CCY2=CCY3

CCX4=CCX5=CCX6=CCX7 CCY4=CCY5=CCY6=CCY7 

SIMD longword (L) operations update all eight condition codes as follows (depending 
on the side of the machine). 

CCX0=CCX1=CCX2=CCX3=CCX4=CCX5=CCX6=CCX7 
CCY0=CCY1=CCY2=CCY3=CCY4=CCY5=CCY6=CCY7 
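
The replication of condition codes for the different degrees of packing can be sketched as follows. The function takes one four bit condition code per lane and returns the eight condition codes for one side of the machine; the names used are hypothetical.

```python
# Number of lanes for each integer packing designator.
LANES = {"B": 8, "H": 4, "W": 2, "L": 1}


def replicate_condition_codes(lane_codes, packing):
    """Expand one condition code per lane into eight condition codes.

    Each lane's code is copied into 8 // lanes adjacent condition codes,
    giving pairs for H, quads for W and all eight for L.
    """
    lanes = LANES[packing]
    if len(lane_codes) != lanes:
        raise ValueError("expected one condition code per lane")
    copies = 8 // lanes
    return [code for code in lane_codes for _ in range(copies)]
```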

Use of the test register and condition code register will now be described with 
reference to the following examples. It will be understood that the following are 
examples only used to illustrate the principles of operation. 

Consider the arithmetic operation ADD. This instruction can be implemented in a 
register-to-register format using format (1) in Figure 3, or in a register-to-immediate 
format using format (2) of Figure 3. Only the register-to-register format is described 
in detail in the following to illustrate the principles of the use of the test registers and 
the condition code register. 

The instruction ADD has the following semantics: 

ADD{B | H | W | L}{S}{Treg,} Dest, Src1, Src2.

Considering format (1) in Figure 3, this instruction thus defines two source registers 
in the fields SRC1 and SRC2, each source register containing operands for the 
addition operation. If the instruction is ADDL, the registers are not packed but each 
contains a 64 bit object. For the ADDB, ADDH and ADDW instructions, each source 
register contains a number of packed objects according to the degree of packing
defined earlier. It is assumed in the following example that each source register
holds eight packed objects b0 ... b7. A destination register DST is defined in the DST
field of the instruction for holding the result of the addition operation. Eight packed 
objects are generated in the destination register. Assume for the time being that a 
single 32 bit operation is being effected by one channel (X or Y) of the processor. 

Figure 6 illustrates a diagram of one of the functional units which is capable of 
executing the ADD instruction. First and second 64 bit buffers 40,42 are provided for 
holding first and second source operands. In the illustrated example, the source 
operands contain eight packed objects b0 to b7 in each case. Addition circuitry 44
comprises eight operators, in this case addition circuits, each connected to receive
two packed objects respectively from the same lane b0 ... b7 of the input buffers
40,42. Only the connections of the first and last objects are shown. Similar 
connections apply to each of the other addition circuits. An output buffer 46 holds the 
result which is likewise in the form of eight packed objects for loading into the 
destination register specified in the instruction. A condition code generator 48 also 
examines the results of each addition and generates condition codes for the side of 
the machine where the instruction is being executed to be held in the condition code 
register CCreg. The condition generator 48 always generates eight condition codes 
for each side of the machine regardless of the degree of packing of the source
operands, as described more fully in the following. 

The instruction format (1) contains an S flag (bit 21). If this flag is set, then the 
condition codes for the side of the machine in which the instruction is being executed 
are set. Assume in this example that the instruction is being executed on the X side 
of the machine. The condition code generator 48 generates condition codes CCX0
to CCX7 by considering the results of the addition operations which were carried out
on each packed object in the source registers and determining from those operations
the values of N,Z,C and V which are the bits defining each condition code. In the
example of Figure 6, a different condition code can be generated for each condition
code location CCX0 ... CCX7 in the condition code register. Condition codes are set
as follows. The N flag of the condition code is set if the result is negative, that is the
top bit is set (even for unsigned inputs). The Z flag is set if the result is zero. The
C flag is set if the operation generated an output carry. The V flag is set if the 
operation overflowed its range before being wrapped or saturated. 
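
A minimal sketch of the flag setting for a single byte lane of a wrapping ADD is given below. It assumes that the V flag denotes signed (two's complement) overflow, which is one reading of "overflowed its range"; saturating variants are not modelled, and the function name is hypothetical.

```python
def add_byte_with_flags(a, b):
    """Add two 8 bit objects; return (wrapped result, NZCV condition code)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    n = result >> 7                  # top bit of the result
    z = int(result == 0)             # result is zero
    c = total >> 8                   # output carry
    # Assumed signed overflow: operands share a sign that the result lacks.
    v = ((~(a ^ b) & (a ^ result)) >> 7) & 1
    return result, (n << 3) | (z << 2) | (c << 1) | v
```

For example, under these assumptions adding 0x7F and 0x01 gives the result 0x80 with N and V set, while adding 0xFF and 0x01 gives zero with Z and C set.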

If the S flag in the instruction is not set, the condition register is not updated. 

For the above explanation it has been assumed that the addition operation was 
carried out. It is possible to predicate the ADD instruction itself on condition codes 
which have been set by an earlier instruction or instruction sequence. For this 
purpose, the instruction format contains a test register field TST (bits 18 to 20). 
These three bits allow one of the test register bytes Treg (1..7) to be addressed. If 
a test register byte is specified in the TST field of the instruction, then each condition 
code on the side of the machine where the instruction is being executed is checked 
against the condition code specified in the addressed Treg byte. If the TST field is 
set to zero, such that no Treg byte is specified, then the instruction always executes. 

After comparing the test code specified in the addressed Treg byte with each of the 
condition codes CCX0 ... CCX7 (assuming the operation is being executed on the X
side of the machine), then the specified operation is carried out on the SIMD lanes 
where there is a match, and is not carried out on the SIMD lanes where there is no 
match. An example is illustrated in Figure 7. Assume that the operation illustrated 
in Figure 6 and described above has been carried out and that condition codes CCX0
to CCX7 have been set as described above depending on the results of the arithmetic
operation in each of the SIMD lanes b0 ... b7. It is assumed for this example that the
condition codes are: b0 0010, b1 0101, b2 0011, b3 0010, b4 0010, b5 0100. This is
illustrated in the condition code register in Figure 7. Let us also assume that the
addressed test register byte in the TST field of the instruction holds the condition
code 0011. This denotes the condition Carry Set C. SIMD lanes b0,b2,b3,b4 satisfy
this condition. Assume that the subsequent operation to be carried out is also an
ADD instruction operating on the byte packed contents of two source registers
SRC1,SRC2 with the results to be loaded into a destination register DST. Because
a test register byte has been specified, the addition operation is only effected on the
SIMD lanes where the condition code set for that lane (CCX0 ... CCX7) satisfies the
condition defined by the test code set in the addressed Treg byte. This is determined 
by a condition code checker 50. The output of the condition code checker 50 controls 
a set of switches 52, one for each SIMD lane b0 ... b7. These switches control
whether or not the results of the addition operation recited in the instruction update 
the values in the corresponding lane in the destination register DST. This is shown 
diagrammatically in Figure 7, with a cross illustrating that the result of the addition 
operation does not get loaded into the destination register, and a through arrow 
illustrating that it does. This is denoted in the destination register by DST denoting 
an original byte in the destination register (that is prior to execution of the instruction), 
and RES denoting a result byte which, following execution of the instruction is a result 
of the arithmetic operation on that lane. 
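
The lane-by-lane behaviour of Figure 7 can be modelled in software as follows. This is a sketch only: it uses the six condition codes given for lanes b0 to b5 (the codes for b6 and b7 are not stated and are therefore left out), assumes wrapping byte addition, and takes the LE test as the complement of GT. All function and variable names are hypothetical.

```python
def condition_holds(test_code, cc):
    """Evaluate a Table 1 test code against an NZCV condition code."""
    n, z, c, v = [bool((cc >> shift) & 1) for shift in (3, 2, 1, 0)]
    table = [True, z, not z, c, not c, n, not n, v, not v,
             c and not z, (not c) or z, n == v, n != v,
             (not z) and n == v, z or n != v]
    return table[test_code]   # test code 1111 is reserved (IndexError)


def predicated_add_bytes(src1, src2, dst, cc_codes, test_code):
    """Return the new destination lanes: RES where the lane's condition code
    satisfies the test code, the original DST byte elsewhere."""
    out = []
    for a, b, old, cc in zip(src1, src2, dst, cc_codes):
        if condition_holds(test_code, cc):
            out.append((a + b) & 0xFF)   # switch closed: result is loaded
        else:
            out.append(old)              # switch open: lane left unchanged
    return out


# Condition codes for lanes b0..b5 as given in the example; test code CS.
FIG7_CODES = [0b0010, 0b0101, 0b0011, 0b0010, 0b0010, 0b0100]
result = predicated_add_bytes([10, 20, 30, 40, 50, 60], [1] * 6, [0] * 6,
                              FIG7_CODES, 0b0011)
```

With these inputs only lanes b0, b2, b3 and b4 are updated, in agreement with the example above.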

As mentioned above, when an instruction is to be executed on less densely packed 
objects, for example H or W, eight condition codes are still set, in pairs or quads as 
mentioned above. Assume for example that an operation is carried out on halfword 
packed objects h0 ... h3. The condition generator 48 determines condition code
values by setting the N,C,Z and V flags as before for each SIMD lane. Thus, four 
condition code values are generated. These are used to set eight condition codes 
in the condition code register as condition code pairs as mentioned above. Once the 
condition codes have been set in the condition code register, they can be used in 
subsequent instructions operating on any degree of packing. For halfword packed 
objects, the first condition code CCX0 and every alternate condition code thereafter
is checked against the test code in the addressed Treg byte to determine whether or 
not the instruction executes on that SIMD lane. For word packed objects, the first 
condition code CCX0 and the fifth condition code CCX4 are checked against the test code
in the addressed Treg byte to determine whether or not the operation should be executed
on that SIMD lane. 

For long word operations, the first condition code CCX0 is evaluated. As described
above, all condition codes for one side of the machine will have the same value set. 
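
Because condition codes are set in pairs, quads or all together, one condition code per lane suffices when checking. The following sketch lists, for each degree of packing, the indices of the condition codes that would be checked, taking the first code of each replicated group; the names are hypothetical.

```python
# Number of lanes for each integer packing designator.
LANES = {"B": 8, "H": 4, "W": 2, "L": 1}


def checked_code_indices(packing):
    """Return the indices (0..7) of the condition codes checked per lane."""
    stride = 8 // LANES[packing]     # size of each replicated group
    return list(range(0, 8, stride))
```

Under this model byte operations check all eight codes, halfword operations check codes 0, 2, 4 and 6, word operations check codes 0 and 4, and long word operations check code 0 only.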

For non-SIMD instructions, that is instructions where the operands are not packed,
the condition codes can still be used to determine whether or not the instruction is
executed. In that case, the TST field of the instruction again identifies a Treg byte.
The condition selector in the Treg byte identifies one condition code which can be on
the right or left sides. The side bit S denotes which side of the machine (X or Y).
That is, any condition code out of the sixteen condition codes CCX0 ... CCX7, CCY0
... CCY7 can be identified by the condition selector. The code held in the condition
bits 0 to 3 of the Treg byte is then tested against the selected condition code to 
determine whether or not the instruction is executed. If the test condition is satisfied 
the instruction is executed. If not, the instruction is not executed. 
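
The selection of a single condition code for a non-SIMD instruction can be sketched as below, using the Treg byte layout described earlier (Condition field in bits 0 to 3, CC-Selector in bits 4 to 6, Side in bit 7 with X=0 and Y=1). The condition codes for each side are passed as lists of eight values, since the bit layout of the condition code register itself is defined only in Figure 5; the function name is hypothetical.

```python
def select_condition_code(treg_byte, ccx, ccy):
    """Return (test_code, selected condition code) for a non-SIMD instruction.

    ccx and ccy hold the eight condition codes CCX0..CCX7 and CCY0..CCY7.
    """
    test_code = treg_byte & 0xF          # Condition field, bits 0 to 3
    selector = (treg_byte >> 4) & 0x7    # CC-Selector field, bits 4 to 6
    side_is_y = (treg_byte >> 7) & 1     # Side field, bit 7 (X=0, Y=1)
    codes = ccy if side_is_y else ccx
    return test_code, codes[selector]
```

The returned test code would then be tested against the selected condition code according to Table 1 to decide whether the instruction executes.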

From the above description it will be appreciated that although the processor has 
right and left sides (X and Y channels), the condition code register is accessible from 
both sides. For 32 bit operations, condition codes are set and accessed according 
to the side of the processor which is actually processing the 32 bit operation. Thus, 
32 bit operations processed in the X channel 5x update the condition codes for the
X channel CCX0 ... CCX7, and are executed conditionally only on the X channel
codes CCX0 ... CCX7. Similarly, for 32 bit operations being executed in the Y
channel 5y. However, 64 bit operations have greater flexibility. 64 bit instructions
also include a TST field for accessing a Treg byte in a manner similar to the 32 bit 
operations discussed above. In that case, the side field S in the accessed Treg byte 
determines which condition codes (X or Y) are used. That is, the side field S 
determines both which condition codes are set pursuant to a 64 bit instruction being 
executed and, also, on which condition codes (X or Y) operation of the 64 bit 
instruction is predicated. Thus, this considerably increases the options available on 
the execution of 64 bit instructions. 

It will be understood that a number of variations are possible to the embodiment 
described above. For example, the Side bit S in the Treg byte can be used always 
to denote the required set of condition codes (X or Y), rather than only in non-SIMD
64-bit instructions as mentioned above. Also, there is no need for a specific
condition set flag S in the instruction - whether or not condition codes are to be set
can be defined in the opcode or the instruction semantics can be such that condition
codes are always set. 

Considering Figure 7, it is possible to implement conditional lane-by-lane execution 
in a number of ways. Another alternative to that described would be to perform the 
operation (and power-up the respective operators) only when the test condition is 
satisfied for that lane. 



What is claimed is:

1. An execution unit for use in a computer system for conditionally carrying out
an operation defined in a computer instruction, the execution unit comprising: 

first and second input stores for holding respective first and second operands 
on which an operation defined in the instruction is to be carried out, wherein each 
store defines a plurality of lanes each holding an object; 

a plurality of operators associated respectively with the lanes for carrying out 
an operation specified in the instruction on objects in corresponding lanes of the first 
and second source operands; 

a destination buffer for holding the results of the operation on a lane-by-lane 
basis; and 

selecting means for determining for each lane in dependence on stored 
condition values whether or not the operation is to be executed on objects in that 
lane. 

2. An execution unit according to claim 1, wherein said condition values comprise
a set of condition codes. 

3. An execution unit according to claim 2, wherein the selecting means comprises 
means for comparing selected ones of said set of condition codes with a test code 
identified in the instruction. 

4. An execution unit according to claim 2, which comprises a condition code 
generator for generating said set of condition codes responsive to execution of an 
instruction. 

5. An execution unit according to claim 2, wherein the number of condition codes
in said set corresponds to the maximum number of lanes in the first and second
source operands. 

6. An execution unit according to claim 4, wherein the number of condition codes 
in said set corresponds to the maximum number of lanes in the first and second 
source operands and wherein the condition code generator is operable to generate 
the set of condition codes so that, when the operands have less than the maximum 
number of lanes, two or more condition codes are set to the same value so that each 
individual condition code in the set is generated regardless of the degree of packing 
of the first and second source operands. 

7. A computer system for conditionally carrying out an operation defined in a 
computer instruction, the computer system comprising:

fetch and decode circuitry for fetching and decoding a sequence of instructions
from a program memory;

at least one execution unit according to any of claims 1 to 6; and

at least one memory access unit for effecting memory access operations
responsive to memory access instructions.

8. A computer system according to claim 7, which comprises a condition code 
register for holding said condition values in the form of a set of condition codes. 

9. A computer system according to claim 7, which includes a test register for 
holding a test code, the test register being addressed by a computer instruction and 
said test code being used in comparison with said condition values to determine for 
each lane whether or not the operation is to be executed on objects in that lane. 

10. A method of executing instructions on operands containing a plurality of 
packed objects, the method comprising: 

accessing at least one source operand containing a plurality of packed objects 
in respective lanes; 

accessing stored condition values to determine for each lane whether or not
an operation defined in the instruction is to be implemented on that lane of the
operand; and 

carrying out the operation and updating a destination operand only in those 
lanes for which the stored condition value indicates that the operation should be 
implemented. 

11. A method according to claim 10, wherein the step of accessing the stored 
condition values comprises accessing a set of condition codes held in a condition 
code register and comparing said selected codes with a test code identified in the 
instruction. 

12. A method according to claim 11, wherein the test code is held in a test register
which is identified by an address in the instruction.

Abstract of the Disclosure

CONDITIONAL INSTRUCTION EXECUTION IN A COMPUTER 

A computer system is disclosed capable of conditionally carrying out an operation 
defined in a computer instruction. The computer instruction is implemented on so- 
called packed operands, that is operands containing a plurality of packed objects in 
respective lanes. An operation defined in the computer instruction is conditionally 
carried out in dependence on stored condition values which determine for each lane 
whether or not the operation is to be executed on objects in that lane. 

An execution unit for a computer system, a computer system and a method of 
executing instructions are defined. 






FILE NO.: 1073/0G1 16 



DECLARATION 
AND POWER OF ATTORNEY 
Original Application 

As a below named inventor, I declare that the information given herein is true, that I 
believe that I am the original, first and sole inventor if only one name is listed at 1 
below, or a joint inventor if plural inventors are named below, of the invention entitled: 

CONDITIONAL INSTRUCTION EXECUTION IN A COMPUTER 

which is described and claimed in: 

[ ] the attached specification or [X] the specification in application 

Serial No. 09/ , filed 9/13/99 

(for declaration not accompanying appl.) 

that I do not know and do not believe that the same was ever known or used in the 
United States of America before my or our invention thereof or patented or described 
in any printed publication in any country before my or our invention thereof, or more 
than one year prior to this application, or in public use or on sale in the United States 
of America more than one year prior to this application, that the invention has not been 
patented or made the subject of an inventor's certificate issued before the date of this 
application in any country foreign to the United States of America on an application 
filed by me or my legal representatives or assigns more than twelve months prior to 
this application, that I acknowledge my duty to disclose information of which I am 
aware which is material to patentability in accordance with 37 CFR §1.56. I hereby 
state that I have reviewed and understand the contents of the above-identified 
specification, including the claims, as amended by any amendment referred to above. 

I hereby claim the priority benefits under 35 U.S.C. 119 of any application(s) for 
patent or inventor's certificate listed below. All foreign applications for patent or 
inventor's certificate on this invention filed by me or my legal representatives or 
assigns prior to the application(s) of which priority is claimed are also identified below. 



PRIOR APPLICATION(S), IF ANY, OF WHICH PRIORITY IS CLAIMED 



COUNTRY APPLICATION NO. DATE OF FILING 

United Kingdom 21 July 1999 



ALL FOREIGN APPLICATIONS, IF ANY, FILED PRIOR TO THE APPLICATION(S) OF WHICH PRIORITY IS CLAIMED 



COUNTRY APPLICATION NO. DATE OF FILING 



POWER OF ATTORNEY: 

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and transact all 
business in the Patent and Trademark Office connected therewith: Gordon D. Coplein #19,165, William F. Dudine, Jr. #20,569, 
Michael J. Sweedler #19,937, S. Peter Ludwig #25,351, Paul Fields #20,298, Harold E. Wurst #22,183, Joseph B. Lerch 
#26,936, Melvin C. Garner #26,272, Ethan Horwitz #27,646, Beverly B. Goodwin #28,417, Adda C. Gogoris #29,714, Martin 
E. Goldstein #20,869, Bert J. Lewen #19,407, Henry Sternberg #22,408, Robert A. Green #28,301, Peter C. Schechter 
#31,662, Robert Schaffer #31,194, Robert C. Sullivan, Jr. #30,499, Ira J. Levy #35,587, Joseph R. Robinson #33,448 

all of the firm of DARBY & DARBY P.C., 805 Third Avenue, New York, NY 10022 

SEND CORRESPONDENCE TO: DIRECT TELEPHONE CALLS TO: 

DARBY & DARBY P.C. MELVIN C. GARNER 

805 Third Avenue 

New York, NY 10022 212-527-7700 

FULL NAME AND RESIDENCE OF INVENTOR 1 

LAST NAME: WILSON FIRST NAME: Sophie MIDDLE NAME: 

CITY: Cambridge STATE OR FOREIGN COUNTRY: ENGLAND COUNTRY OF CITIZENSHIP: ENGLAND 

POST OFFICE ADDRESS: 6 Williow Grove, Lode CITY: Cambridge STATE OR COUNTRY: ENGLAND ZIP CODE: CB5 9EL 

I further declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false statements and the like so made 
are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United 
States Code, and that such willful false statements may jeopardize the validity of the 
application or any patent issuing thereon. 



SIGNATURE OF INVENTOR 1: DATED: 



(D&DForms/PTO-21) 



REV. 12/87 



